chore: rolling promotion dev -> main #589

Open
vasconceloscezar wants to merge 550 commits into main from dev

Conversation

@vasconceloscezar vasconceloscezar commented May 1, 2026

Collaborator

Rolling Promotion PR

Auto-maintained rolling promotion PR from dev to main.

Process:

  • This PR is automatically created and kept open
  • Agent monitors CI status and fixes issues
  • Human reviews and merges when ready
  • Label ready-to-merge added when all checks pass

Human approval required for merge to production.


Notable changes in this promotion

This rolling batch promotes the full dev backlog (~550 commits). Highlights from recent work:

🔒 Security — scope-enforcer P1 fixes (#719)

Closed three P1 authorization gaps surfaced by the review bots:

  • Body-injection scope bypass: extractLockTargets trusted the request body over path params, letting a caller authorize against one instance/chat while operating on another. Path/header targets now win over the (caller-controlled) body.
  • Missed signature enforcement + false 403s on header-scoped routes (x-omni-instance/x-omni-chat, e.g. POST /turns/close) — those targets are now extracted and enforced.

🐛 Data correctness (#721)

  • chats.settings clobber — handoff, close-contact, and clear-session paths replaced the whole settings JSONB, dropping followUpConfig/close* keys. Now merge over existing settings.
  • chat.closed event published escalated: terminal — hard closes (won/lost) were always reported as escalations. Now publishes the computed escalated.

🔄 Embedded → canonical pgserve upgrade — fully automatic & host-tooling-free (#722 — PRs #723, #724, #725, #727)

Upgrading an embedded-Postgres install to canonical pgserve used to crash-loop and dead-end (omni doctor --fix needed pg_dump/psql, which aren't bundled and were often the wrong major). Now:

Validated end-to-end in clean containers: @latest embedded install + seeded data → update → single omni doctor --fix → migrated to canonical, omni-api healthy, all entity types preserved with identical IDs (instances, agent providers, agents, routes, persons/identities, chats, participants, messages, follow-up config, automations, webhooks). Only message media blobs (auto-rehydrated from the channel) and API keys (canonical keeps its own) are intentionally not copied.

✅ Test reliability (#720 + #727 follow-on)

De-flaked load-sensitive timing tests (JourneyTracker, writeDiagnostics) that intermittently failed the full-suite gate under host CPU starvation.

coderabbitai Bot commented May 1, 2026

Important

Review skipped

Too many files!

This PR contains 295 files, which is 145 over the limit of 150.

To get a review, narrow the scope:
• coderabbit review --type committed # exclude uncommitted changes
• coderabbit review --dir # limit to a subdirectory
• coderabbit review --base # compare against a closer base

Upgrade to a paid plan to raise the limit.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 665ffbf6-8524-4912-a127-32e078d3ebcf

📥 Commits

Reviewing files that changed from the base of the PR and between fe155b8 and ee7ab1d.

⛔ Files ignored due to path filters (5)
  • .claude/scheduled_tasks.lock is excluded by !**/*.lock, !.claude/**
  • .github/cosign.pub is excluded by !**/*.pub
  • README.md is excluded by !*.md
  • SECURITY.md is excluded by !*.md
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (295)
  • .claude-plugin/marketplace.json
  • .env.example
  • .genie/.gitignore
  • .genie/DREAM.md
  • .genie/agents.json
  • .genie/brainstorm.md
  • .genie/brainstorms/_archived/fix-omni-bugs-243-244/DESIGN.md
  • .genie/brainstorms/_archived/fix-omni-bugs-243-244/DRAFT.md
  • .genie/brainstorms/_archived/generate-image-native/DRAFT.md
  • .genie/brainstorms/_archived/omni-docs-cleanup/DESIGN.md
  • .genie/brainstorms/_archived/omni-docs-cleanup/DRAFT.md
  • .genie/brainstorms/_archived/route-config-overrides/DESIGN.md
  • .genie/brainstorms/_archived/route-config-overrides/DRAFT.md
  • .genie/brainstorms/_archived/session-observatory/DRAFT.md
  • .genie/brainstorms/cli-360-agents-update-events-get-verbose-logs/DESIGN.md
  • .genie/brainstorms/cli-360-agents-update-events-get-verbose-logs/DRAFT.md
  • .genie/brainstorms/fix-omni-bugs-243-244/DESIGN.md
  • .genie/brainstorms/fix-omni-bugs-243-244/DRAFT.md
  • .genie/brainstorms/gupshup-channel-rewrite/DESIGN.md
  • .genie/brainstorms/gupshup-channel-rewrite/DRAFT.md
  • .genie/brainstorms/gupshup-handoff-message/DESIGN.md
  • .genie/brainstorms/gupshup-handoff-message/DRAFT.md
  • .genie/brainstorms/omni-docs-cleanup/DESIGN.md
  • .genie/brainstorms/omni-docs-cleanup/DRAFT.md
  • .genie/brainstorms/omni-vs-friend-comparison/DESIGN.md
  • .genie/brainstorms/omni-vs-friend-comparison/DRAFT.md
  • .genie/brainstorms/route-config-overrides/DESIGN.md
  • .genie/brainstorms/route-config-overrides/DRAFT.md
  • .genie/brainstorms/sentry-mcp/DRAFT.md
  • .genie/brainstorms/sentry-mcp/research-bun-sdk.md
  • .genie/brainstorms/sentry-mcp/research-codebase-mapping.md
  • .genie/brainstorms/sentry-mcp/research-privacy-and-mcp.md
  • .genie/brainstorms/sentry-mcp/research-product-features.md
  • .genie/mailbox/genie-implementor-2.json
  • .genie/mailbox/genie-implementor-3.json
  • .genie/mailbox/genie-implementor-4.json
  • .genie/mailbox/genie-implementor.json
  • .genie/mailbox/genie-review-81-v2.json
  • .genie/mailbox/genie-review-88.json
  • .genie/mailbox/genie-review.json
  • .genie/mailbox/omni-council-81.json
  • .genie/mailbox/omni-council-82.json
  • .genie/mailbox/omni-council-85.json
  • .genie/mailbox/omni-council-86.json
  • .genie/mailbox/omni-council-87.json
  • .genie/mailbox/omni-council-88.json
  • .genie/mailbox/omni-council-90.json
  • .genie/mailbox/omni-council-91.json
  • .genie/mailbox/omni-council-92.json
  • .genie/mailbox/omni-council-93.json
  • .genie/mailbox/omni-dream-81.json
  • .genie/mailbox/omni-dream-85.json
  • .genie/mailbox/omni-dream-86.json
  • .genie/mailbox/omni-dream-87.json
  • .genie/mailbox/omni-dream-88.json
  • .genie/mailbox/omni-dream-90.json
  • .genie/mailbox/omni-fix-81.json
  • .genie/mailbox/omni-fix-82.json
  • .genie/mailbox/omni-fix-85.json
  • .genie/mailbox/omni-fix-88.json
  • .genie/mailbox/omni-fix.json
  • .genie/mailbox/omni-pm-team-lead.json
  • .genie/mailbox/omni-review-86.json
  • .genie/mailbox/omni-review-87.json
  • .genie/mailbox/omni-review-90.json
  • .genie/mailbox/omni-wish-81.json
  • .genie/mailbox/omni-wish-82.json
  • .genie/mailbox/omni-wish-85.json
  • .genie/mailbox/omni-wish-86.json
  • .genie/mailbox/omni-wish-87.json
  • .genie/mailbox/omni-wish-88.json
  • .genie/mailbox/omni-wish-90.json
  • .genie/mailbox/omni-wish-91.json
  • .genie/mailbox/omni-wish-92.json
  • .genie/mailbox/omni-wish-93.json
  • .genie/wishes/_SHIPPED.md
  • .genie/wishes/_archived/omni-backlog-sprint/WISH.md
  • .genie/wishes/_archived/omni-finish-line/WISH.md
  • .genie/wishes/channel-error-migration/COUNCIL-REVIEW.md
  • .genie/wishes/channel-error-migration/WISH.md
  • .genie/wishes/channel-plugin-generator/COUNCIL-REVIEW.md
  • .genie/wishes/channel-plugin-generator/WISH.md
  • .genie/wishes/chat-attention-system/WISH.md
  • .genie/wishes/cli-360-agents-update-events-get-verbose-logs/WISH.md
  • .genie/wishes/feat-mark-online-configurable/WISH.md
  • .genie/wishes/fix-automation-pairing/WISH.md
  • .genie/wishes/fix-cli-json-reliability/WISH.md
  • .genie/wishes/fix-contacts-pushname-missing/WISH.md
  • .genie/wishes/fix-debounce-message-drops/WISH.md
  • .genie/wishes/fix-dm-context-and-quoted-truncation/WISH.md
  • .genie/wishes/fix-genie-client-autospawn-cwd/WISH.md
  • .genie/wishes/fix-group-name-null-fallback/WISH.md
  • .genie/wishes/fix-gupshup-quality-gate/WISH.md
  • .genie/wishes/fix-inbox-bridge-sanitization/WISH.md
  • .genie/wishes/fix-lid-jid-fragmentation/WISH.md
  • .genie/wishes/fix-nats-genie-reply-subscription/WISH.md
  • .genie/wishes/fix-omni-bugs-243-244/WISH.md
  • .genie/wishes/fix-omni-mini-bugs-330-336-338/WISH.md
  • .genie/wishes/fix-omni-minibugs-245-246-247/WISH.md
  • .genie/wishes/fix-person-deduplication/WISH.md
  • .genie/wishes/fix-quick-wins-344-345-335/WISH.md
  • .genie/wishes/fix-quick-wins-371-372-373/WISH.md
  • .genie/wishes/fix-server-version/WISH.md
  • .genie/wishes/fix-whatsapp-edit-long-messages/WISH.md
  • .genie/wishes/genie-session-passthrough/WISH.md
  • .genie/wishes/group-members-with-names/WISH.md
  • .genie/wishes/gupshup-channel-rewrite/WISH.md
  • .genie/wishes/gupshup-handoff-message/WISH.md
  • .genie/wishes/md-simplification/WISH.md
  • .genie/wishes/omni-agentic-cli/WISH.md
  • .genie/wishes/omni-docs-cleanup/WISH.md
  • .genie/wishes/omni-dx-quick-fixes/WISH.md
  • .genie/wishes/omni-finish-line/WISH.md
  • .genie/wishes/omni-genie-integration-v2/WISH.md
  • .genie/wishes/omni-install-resilience/WISH.md
  • .genie/wishes/omni-skills-sync/WISH.md
  • .genie/wishes/remove-baileys-logger-from-core/COUNCIL-REVIEW.md
  • .genie/wishes/remove-baileys-logger-from-core/WISH.md
  • .genie/wishes/remove-channel-leaks-from-core/WISH.md
  • .genie/wishes/route-config-overrides/WISH.md
  • .genie/wishes/sdk-compliance-test-suite/WISH.md
  • .genie/wishes/sdk-compliance-tests/COUNCIL-REVIEW.md
  • .genie/wishes/sdk-compliance-tests/WISH.md
  • .genie/wishes/sentry-integration/COUNCIL-REVIEW.md
  • .genie/wishes/sentry-integration/EXECUTION-REVIEW.md
  • .genie/wishes/sentry-integration/REVIEW.md
  • .genie/wishes/sentry-integration/WISH.md
  • .genie/wishes/standardize-sendtyping/WISH.md
  • .genie/wishes/whatsapp-labels-sync/WISH.md
  • .github/ISSUE_TEMPLATE/signing-key-fingerprint.md
  • .github/workflows/build-tarballs.yml
  • .github/workflows/ci.yml
  • .github/workflows/release-publish.yml
  • .github/workflows/release.yml
  • .github/workflows/sign-attest.yml
  • .github/workflows/signing-identity-pin.yml
  • .github/workflows/version.yml
  • .gitignore
  • .husky/pre-push
  • .well-known/security.txt
  • Jenkinsfile
  • Jenkinsfile.dev
  • Jenkinsfile.fleet-cli
  • Makefile
  • apps/ui/package.json
  • apps/ui/src/components/instances/AgentConfigForm.tsx
  • apps/ui/src/components/instances/CreateInstanceModal.tsx
  • biome.json
  • cliff.toml
  • docs/api/endpoints.md
  • docs/architecture/provider-system.md
  • docs/channel-parity/telegram-whatsapp.md
  • docs/cli/design.md
  • docs/design/cli-installer.md
  • docs/guides/openclaw-integration.md
  • docs/guides/sofia-telegram-onboarding.md
  • docs/reports/khal-smart-response-gate-qa.md
  • docs/reports/khal-whatsapp-disconnect-forensics-2026-02-12.md
  • docs/reports/tmp/khal-disconnect-anchors.txt
  • docs/reports/tmp/khal-focused.txt
  • docs/reports/tmp/khal-grep-all.txt
  • docs/reports/tmp/khal-target-timeline.txt
  • docs/research/MASTER-PLAN.md
  • docs/research/openclaw-protocol-verification.md
  • docs/sdk/auto-generation.md
  • ecosystem.config.cjs
  • install-client.sh
  • install.sh
  • knip.json
  • package.json
  • packages/api/bench/output-redactor.bench.ts
  • packages/api/package.json
  • packages/api/src/__tests__/a2a-integration.test.ts
  • packages/api/src/__tests__/auto-reply-filter.test.ts
  • packages/api/src/__tests__/consumer-config.test.ts
  • packages/api/src/__tests__/delete-channel-resolve.test.ts
  • packages/api/src/__tests__/dispatcher-first-party-sender.test.ts
  • packages/api/src/__tests__/event-persistence.test.ts
  • packages/api/src/__tests__/follow-up-lifecycle.test.ts
  • packages/api/src/__tests__/follow-up-sweeper.test.ts
  • packages/api/src/__tests__/health-redirect.test.ts
  • packages/api/src/__tests__/host-fingerprint-pipeline.test.ts
  • packages/api/src/__tests__/issue-496-bare-list-handlers.test.ts
  • packages/api/src/__tests__/messages-send-reaction.test.ts
  • packages/api/src/__tests__/pgserve.test.ts
  • packages/api/src/__tests__/unified-messages.test.ts
  • packages/api/src/admin.ts
  • packages/api/src/app.ts
  • packages/api/src/cache/cache-keys.ts
  • packages/api/src/constants/__tests__/profiles.test.ts
  • packages/api/src/constants/__tests__/verbs.test.ts
  • packages/api/src/constants/profiles.ts
  • packages/api/src/constants/scopes.ts
  • packages/api/src/constants/verbs.ts
  • packages/api/src/index.ts
  • packages/api/src/lib/__tests__/close-contact-state.test.ts
  • packages/api/src/lib/__tests__/idempotency.test.ts
  • packages/api/src/lib/__tests__/verbs-to-scopes.test.ts
  • packages/api/src/lib/close-contact-state.ts
  • packages/api/src/lib/idempotency.ts
  • packages/api/src/lib/resolve-profile.ts
  • packages/api/src/lib/verbs-to-scopes.ts
  • packages/api/src/middleware/__tests__/genie-signature.test.ts
  • packages/api/src/middleware/__tests__/output-redactor.test.ts
  • packages/api/src/middleware/__tests__/require-signed-instance.test.ts
  • packages/api/src/middleware/__tests__/scope-enforcer-host-scopes.test.ts
  • packages/api/src/middleware/__tests__/scope-enforcer.test.ts
  • packages/api/src/middleware/auth.ts
  • packages/api/src/middleware/body-limit.ts
  • packages/api/src/middleware/genie-signature.ts
  • packages/api/src/middleware/output-redactor.ts
  • packages/api/src/middleware/require-signed-instance.ts
  • packages/api/src/middleware/scope-enforcer.ts
  • packages/api/src/pgserve.ts
  • packages/api/src/plugin-state.ts
  • packages/api/src/plugins/__tests__/agent-dispatcher-retry.test.ts
  • packages/api/src/plugins/__tests__/agent-dispatcher.test.ts
  • packages/api/src/plugins/__tests__/media-processor-persistence.test.ts
  • packages/api/src/plugins/__tests__/message-persistence-sent.test.ts
  • packages/api/src/plugins/__tests__/person-dedup.test.ts
  • packages/api/src/plugins/__tests__/session-cleaner.test.ts
  • packages/api/src/plugins/__tests__/sync-worker-history-push.test.ts
  • packages/api/src/plugins/agent-dispatcher.ts
  • packages/api/src/plugins/agent-responder.ts
  • packages/api/src/plugins/event-persistence.ts
  • packages/api/src/plugins/follow-up-hooks.ts
  • packages/api/src/plugins/instance-monitor.ts
  • packages/api/src/plugins/media-processor.ts
  • packages/api/src/plugins/message-persistence.ts
  • packages/api/src/plugins/session-cleaner.ts
  • packages/api/src/plugins/sync-worker.ts
  • packages/api/src/providers/deepseek/vision.test.ts
  • packages/api/src/providers/deepseek/vision.ts
  • packages/api/src/providers/gemini/client.ts
  • packages/api/src/providers/gemini/imagegen.test.ts
  • packages/api/src/providers/gemini/imagegen.ts
  • packages/api/src/providers/gemini/musicgen.test.ts
  • packages/api/src/providers/gemini/musicgen.ts
  • packages/api/src/providers/gemini/stt.test.ts
  • packages/api/src/providers/gemini/stt.ts
  • packages/api/src/providers/gemini/tts.test.ts
  • packages/api/src/providers/gemini/tts.ts
  • packages/api/src/providers/gemini/videogen.test.ts
  • packages/api/src/providers/gemini/videogen.ts
  • packages/api/src/providers/openai/imagegen.test.ts
  • packages/api/src/providers/openai/imagegen.ts
  • packages/api/src/providers/openai/stt.test.ts
  • packages/api/src/providers/openai/stt.ts
  • packages/api/src/providers/openai/tts.test.ts
  • packages/api/src/providers/openai/tts.ts
  • packages/api/src/providers/registry.ts
  • packages/api/src/providers/types.ts
  • packages/api/src/routes/health.ts
  • packages/api/src/routes/v2/__tests__/access-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/chats-id-jid-resolution.test.ts
  • packages/api/src/routes/v2/__tests__/chats-messages-after-validation.test.ts
  • packages/api/src/routes/v2/__tests__/close-contact-config.test.ts
  • packages/api/src/routes/v2/__tests__/context-route.test.ts
  • packages/api/src/routes/v2/__tests__/dead-letters-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/event-ops-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/events-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/instances-group-mutations.test.ts
  • packages/api/src/routes/v2/__tests__/keys-admin-route-guard.test.ts
  • packages/api/src/routes/v2/__tests__/keys-audit-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/messages-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/messages-send-media.test.ts
  • packages/api/src/routes/v2/__tests__/messages-send-reaction.test.ts
  • packages/api/src/routes/v2/__tests__/persons-timeline-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/settings-history-date-validation.test.ts
  • packages/api/src/routes/v2/__tests__/trust-handshake.test.ts
  • packages/api/src/routes/v2/_close-contact-config.ts
  • packages/api/src/routes/v2/a2a.ts
  • packages/api/src/routes/v2/access.ts
  • packages/api/src/routes/v2/agent-routes.ts
  • packages/api/src/routes/v2/agents.ts
  • packages/api/src/routes/v2/chats.ts
  • packages/api/src/routes/v2/context.ts
  • packages/api/src/routes/v2/dead-letters.ts
  • packages/api/src/routes/v2/event-ops.ts
  • packages/api/src/routes/v2/events.ts
  • packages/api/src/routes/v2/index.ts
  • packages/api/src/routes/v2/instances.ts
  • packages/api/src/routes/v2/keys.ts
  • packages/api/src/routes/v2/media.ts
  • packages/api/src/routes/v2/messages.ts
  • packages/api/src/routes/v2/persons.ts
  • packages/api/src/routes/v2/processed-events.ts
  • packages/api/src/routes/v2/providers.ts
  • packages/api/src/routes/v2/settings.ts
  • packages/api/src/routes/v2/trust.ts
  • packages/api/src/schemas/__tests__/date-query.test.ts
  • packages/api/src/schemas/date-query.ts
  • packages/api/src/schemas/openapi/agents.ts
  • packages/api/src/schemas/openapi/instances.ts
  • packages/api/src/schemas/openapi/providers.ts

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request significantly expands the Omni platform's capabilities by introducing a Twilio WhatsApp channel plugin, a host-fingerprint trust system for Genie hosts, and an output-redactor middleware for data privacy. It also hardens the system with idempotency guards for NATS subscribers, transient error retries for agent dispatches, and improved date validation across API endpoints. Review feedback identified critical bugs including a missing import in the media processor and potential data loss in chats.settings and sync job progress updates due to non-merging object replacements. Furthermore, the polling mechanism for linking media to events was noted as an area for performance optimization.

const [event] = await ctx.db
  .select({ id: omniEvents.id })
  .from(omniEvents)
  .where(eq(omniEvents.id, eventId))

high

The eq operator is used here but does not appear to be imported from drizzle-orm in this file. This will likely cause a compilation error unless it is imported in a part of the file not shown in the diff.

Comment on lines 1544 to 1546
await services.chats.update(data.chatId, {
  settings: { agentPaused: true },
});

high

Accidental data loss in chats.settings. This update replaces the entire settings JSONB object with { agentPaused: true }, clobbering any existing metadata keys such as followUpConfig or closeOutcome. Per general rules, metadata updates should merge with existing data rather than replacing the entire object.

Suggested change
- await services.chats.update(data.chatId, {
-   settings: { agentPaused: true },
- });
+ const chat = await services.chats.getById(data.chatId);
+ await services.chats.update(data.chatId, {
+   settings: { ...((chat?.settings as Record<string, unknown>) ?? {}), agentPaused: true },
+ });
References
  1. When implementing an update command for a resource that has a metadata field, the update operation should merge the new metadata with the existing metadata rather than replacing it, to prevent accidental data loss of other metadata keys.

Comment on lines +1777 to +1784
await services.chats.update(data.chatId, {
  settings: {
    ...(shouldPauseAgent ? { agentPaused: true } : {}),
    closed: terminal,
    closeUntil: closeUntil?.toISOString() ?? null,
    closeOutcome: outcome,
  } as Record<string, unknown>,
});

high

Accidental data loss in chats.settings. Similar to the handoff path, this update clobbers the entire settings JSONB column. Existing configuration keys like followUpConfig will be lost. You must fetch the current settings and merge them before updating.

Suggested change
- await services.chats.update(data.chatId, {
-   settings: {
-     ...(shouldPauseAgent ? { agentPaused: true } : {}),
-     closed: terminal,
-     closeUntil: closeUntil?.toISOString() ?? null,
-     closeOutcome: outcome,
-   } as Record<string, unknown>,
- });
+ const chat = await services.chats.getById(data.chatId);
+ await services.chats.update(data.chatId, {
+   settings: {
+     ...((chat?.settings as Record<string, unknown>) ?? {}),
+     ...(shouldPauseAgent ? { agentPaused: true } : {}),
+     closed: terminal,
+     closeUntil: closeUntil?.toISOString() ?? null,
+     closeOutcome: outcome,
+   } as Record<string, unknown>,
+ });
References
  1. When implementing an update command for a resource that has a metadata field, the update operation should merge the new metadata with the existing metadata rather than replacing it, to prevent accidental data loss of other metadata keys.
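Both suggestions above apply the same read-merge-write pattern. A minimal sketch of that pattern as a reusable helper follows; `mergeSettings` and the `Settings` alias are hypothetical names for illustration and do not come from the reviewed code.

```typescript
// Hypothetical helper for illustration; not part of the reviewed code.
type Settings = Record<string, unknown>;

// Shallow merge: keys in `patch` overwrite, every other existing key survives.
function mergeSettings(existing: Settings | null | undefined, patch: Settings): Settings {
  return { ...(existing ?? {}), ...patch };
}

// Example: pausing the agent keeps followUpConfig and closeOutcome intact.
const before: Settings = { followUpConfig: { enabled: true }, closeOutcome: "won" };
const after: Settings = mergeSettings(before, { agentPaused: true });
```

Because the read and the write remain separate statements, two concurrent updates can still overwrite each other. An atomic merge in the database (for example PostgreSQL's jsonb `||` operator) would close that window, though whether the service layer exposes one is not shown here.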

Comment on lines +257 to +290
async function resolveSafeMediaContentEventId(
  ctx: MediaProcessorContext,
  eventId: string | undefined,
): Promise<string | null> {
  if (!isUuid(eventId)) return null;

  // media_content is audit/replay metadata, so do not block media.processed for long.
  // Event persistence runs concurrently with this processor and should normally win within milliseconds.
  const maxWaitMs = 250;
  const pollMs = 50;
  const deadline = Date.now() + maxWaitMs;

  while (true) {
    try {
      const [event] = await ctx.db
        .select({ id: omniEvents.id })
        .from(omniEvents)
        .where(eq(omniEvents.id, eventId))
        .limit(1);

      if (event) return event.id;
    } catch (error) {
      log.debug('Failed to validate media_content event FK', { eventId, error: String(error) });
      return null;
    }

    if (Date.now() >= deadline) {
      log.debug('Skipping media_content event FK; omni_event not found', { eventId });
      return null;
    }

    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}

medium

The polling loop in resolveSafeMediaContentEventId is an inefficient way to handle the race condition between event persistence and media processing. While 250ms is a short window, this pattern results in multiple unnecessary database queries and holds up the event handler. Consider using a database-level constraint or a more reactive approach (e.g., a background worker or a retry mechanism in the event bus) to link media content to events.

Comment thread packages/api/src/plugins/sync-worker.ts Outdated
Comment on lines +959 to +961
const update: Partial<SyncJobProgress> = {
  fetched: payload.fetched ?? 0,
  stored: 0,
  duplicates: 0,
  mediaDownloaded: 0,
  totalEstimated:
    payload.progress && payload.progress > 0
      ? Math.round((payload.fetched ?? 0) / (payload.progress / 100))
      : 0,
});
};

medium

Potential counter reset bug. If payload.fetched is undefined, it defaults to 0, which will clobber the existing fetched count in the database during the updateProgress merge. The update object should only include the fetched key if the value is explicitly provided in the payload to avoid accidental data loss.

          const update: Partial<SyncJobProgress> = {};
          if (payload.fetched !== undefined) {
            update.fetched = payload.fetched;
          }
References
  1. When implementing an update command for a resource that has a metadata field, the update operation should merge the new metadata with the existing metadata rather than replacing it, to prevent accidental data loss of other metadata keys.
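To make the effect of the suggestion concrete, here is a small sketch of building the update from only the keys the payload actually carries and then merging it over stored progress. The `SyncJobProgress` fields are taken from the snippet above; `buildProgressUpdate`, `applyProgress`, and the payload shape are assumptions made for illustration.

```typescript
// Field names mirror the snippet under review; the exact type in the repo is not shown.
interface SyncJobProgress {
  fetched: number;
  stored: number;
  duplicates: number;
  mediaDownloaded: number;
  totalEstimated: number;
}

// Hypothetical payload shape: both fields may be absent.
interface ProgressPayload {
  fetched?: number;
  progress?: number; // percentage, 0-100
}

// Only keys with an explicit value end up in the update object.
function buildProgressUpdate(payload: ProgressPayload): Partial<SyncJobProgress> {
  const update: Partial<SyncJobProgress> = {};
  if (payload.fetched !== undefined) {
    update.fetched = payload.fetched;
    if (payload.progress !== undefined && payload.progress > 0) {
      update.totalEstimated = Math.round(payload.fetched / (payload.progress / 100));
    }
  }
  return update;
}

// Stand-in for the merge that updateProgress is described as performing.
function applyProgress(current: SyncJobProgress, update: Partial<SyncJobProgress>): SyncJobProgress {
  return { ...current, ...update };
}
```

With this shape, a payload that omits `fetched` yields an empty update, so the stored counter is left alone instead of being reset to 0.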

@namastex888

Contributor

@codex review this pr

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e765a0b4d

Comment on lines +245 to +253
let instance: string | null = instanceId;
let chat: string | null = chatId;
let recipient: string | null = null;

// Path-param extraction: /instances/:id, /chats/:id
const pathInstance = PATH_INSTANCE_PREFIXES.map((p) => firstPathSegment(cleanPath, p)).find((v) => v != null) ?? null;
const pathChat = PATH_CHAT_PREFIXES.map((p) => firstPathSegment(cleanPath, p)).find((v) => v != null) ?? null;
if (!instance && pathInstance) instance = pathInstance;
if (!chat && pathChat) chat = pathChat;

P1: Prioritize path instance/chat targets over JSON body

The target extraction currently trusts instanceId/chatId from the request body before path params, and only falls back to path values when the body omits them. Because requireSignedInstanceMiddleware and allowlist checks both consume extractLockTargets, a caller can hit a path-scoped endpoint like PATCH /api/v2/instances/:id and inject a different instanceId in JSON to make authorization decisions against the wrong instance/chat. This can bypass require_genie_signature and allowlist enforcement for path-targeted writes.
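A minimal sketch of the precedence this comment asks for, under stated assumptions: `resolveTargets`, `TargetSources`, and `firstPresent` are hypothetical names, and ranking path above header is an assumption (the PR only says path/header targets win over the body). It is not the repository's `extractLockTargets`.

```typescript
// Hypothetical shapes for illustration; the real extractLockTargets signature is not shown here.
interface TargetSources {
  path: { instance?: string | null; chat?: string | null };
  header: { instance?: string | null; chat?: string | null };
  body: { instanceId?: string | null; chatId?: string | null };
}

interface LockTargets {
  instance: string | null;
  chat: string | null;
}

// First non-empty value wins, so argument order encodes the trust order.
function firstPresent(...values: Array<string | null | undefined>): string | null {
  for (const v of values) {
    if (v != null && v !== "") return v;
  }
  return null;
}

// Path, then header, then body: the caller-controlled body only fills gaps
// and can never override a target that the route itself provides.
function resolveTargets(src: TargetSources): LockTargets {
  return {
    instance: firstPresent(src.path.instance, src.header.instance, src.body.instanceId),
    chat: firstPresent(src.path.chat, src.header.chat, src.body.chatId),
  };
}
```

For a request such as PATCH /api/v2/instances/:id carrying a different `instanceId` in its JSON, this ordering makes the authorization decision use the path value.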

Comment thread packages/api/src/routes/v2/messages.ts Outdated
agentId: instance.agentId ?? null,
outcome,
reason: data.reason ?? null,
escalated: terminal,

P2: Publish actual escalation flag in chat.closed payload

The emitted chat.closed event sets escalated to terminal instead of the computed escalated value, so hard-terminal outcomes (won/lost) are always reported as escalations even when no threshold-based auto-promotion happened. This corrupts downstream audit/BI semantics and any automation that distinguishes explicit hard closes from repeated soft-close escalation.
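The distinction can be sketched as follows. Everything here is illustrative: the `"soft"` outcome name, the counter, the threshold parameter, and both function names are assumptions; only "won/lost are hard closes" and "escalated differs from terminal" come from the comment.

```typescript
// All names are hypothetical; the real close logic is not shown in this excerpt.
type CloseOutcome = "won" | "lost" | "soft";

interface CloseDecision {
  terminal: boolean;  // the chat is closed for good
  escalated: boolean; // closed only because soft closes crossed the threshold
}

function decideClose(outcome: CloseOutcome, softCloseCount: number, threshold: number): CloseDecision {
  const hard = outcome === "won" || outcome === "lost";
  // A hard close is terminal but is not an escalation.
  const escalated = !hard && softCloseCount >= threshold;
  return { terminal: hard || escalated, escalated };
}

// The event payload carries the computed `escalated`, not `terminal`.
function buildChatClosedPayload(outcome: CloseOutcome, decision: CloseDecision) {
  return { outcome, escalated: decision.escalated };
}
```

Under these assumptions, a `won` close yields `terminal: true, escalated: false`, which is the case the original payload misreported.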

github-actions Bot and others added 25 commits May 7, 2026 20:57
bun publish started returning `404 Not Found` for scoped packages on
2026-05-07 even though npm whoami via the same NPM_TOKEN succeeds and
the token has read+write access to @automagik/omni. The last successful
publish via `bun publish` was 2.260506.1 on 2026-05-06 15:06; every
publish since fails with the same error pattern:

  Tag: next
  Access: public
  Registry: https://registry.npmjs.org/
  404 Not Found: https://registry.npmjs.org/@automagik%2fomni
   - '@automagik/omni@<v>' does not exist in this registry

Workflow code, bun version (1.3.13), and registry URL all unchanged
across the success→failure boundary, so this is most likely a bun
publish auth-header regression for scoped packages when given env-var
auth only (no .npmrc written).

Switch to the canonical npm setup:
1. Write ~/.npmrc with the registry's _authToken
2. Use `npm publish` (rock-solid, well-instrumented error messages)

Net behavior preserved: same package, same scope, same tag, same
access mode. Only the publisher binary + auth wiring changed.

…ublish

ci(version): use npm publish + explicit .npmrc to fix scoped 404

Drop NPM_TOKEN dependency entirely. Mirror the proven pattern from
automagik/genie .github/workflows/version.yml — short-lived OIDC tokens
exchanged at workflow runtime, no long-lived secret to rotate, no 2FA-
bypass gymnastics.

Why now:
The classic NPM_TOKEN path silently 404'd starting 2026-05-06 evening.
Diagnosis showed token authenticated fine for whoami + reads; PUT (publish)
returned 404 — npm's stand-in for "this token can't bypass 2FA on publish
for this scope". Granular access tokens don't bypass 2FA by default.

Fix: stop using tokens. npm "Trusted Publishers" trusts the workflow
identity directly via OIDC. Felipe saved the trusted-publisher record
on npmjs.com pointing at automagik-dev/omni workflow version.yml (no env)
in parallel with this PR.

Three workflow changes, all mirroring genie's version.yml verbatim where
applicable:

1. Top-level permissions: + id-token: write
   (required to mint OIDC tokens at runtime)

2. New step before publish: Upgrade npm to >= 11.5.1
   Node 22 bundles npm 10.x which sends a placeholder OIDC token and
   gets back a misleading 404 — fixed in npm 11.5.1+. Install latest.

3. Publish step rewrite:
   - Removed: NPM_TOKEN env var, .npmrc write, skip-guard, NPM_CONFIG_TOKEN
   - Added: NPM_CONFIG_PROVENANCE=false (Blacksmith runners fail sigstore
     attestation with 422 even though OIDC token exchange itself works)
   - Kept: HUSKY=0, the cd packages/cli && npm publish invocation

Comment block above the publish step calls out the 3 traps a future
maintainer would otherwise rediscover the hard way:
  - npm 10 placeholder-token 404
  - Blacksmith provenance 422
  - workflow filename binding to the npmjs.com trusted-publisher record

NPM_TOKEN repo secret left in place during transition (defense in depth).
Cleanup is a follow-up wish after ≥3 successful OIDC publishes.

Wish: .genie/wishes/npm-trusted-publisher-oidc/WISH.md (bundled in this PR)

…-oidc

ci(version): switch npm publish to OIDC trusted publishing

The "Upgrade npm for OIDC trusted publishing" step (npm install -g
npm@latest) writes to /usr/local. omni's Blacksmith pool image rejects
this with EACCES on /usr/local/share/man/man7, blocking the publish
step that follows.

GitHub-hosted ubuntu-latest gives the runner user write access to
/usr/local, so npm -g installs cleanly. Trade-off: slower job startup
(~30s vs. Blacksmith's ~5s) for publish reliability — acceptable for a
once-per-PR workflow.

Note: genie's identical workflow runs successfully on Blacksmith. The
EACCES is specific to omni's Blacksmith image — likely a /usr/local
ownership/perm difference between the two pools' base images.

Whole-job switch (vs. splitting into bump-on-Blacksmith + publish-on-
GitHub) keeps the workflow simple — no artifact passing between jobs.
…hosted

ci(version): use ubuntu-latest for Auto Version (Blacksmith /usr/local EACCES blocks OIDC)
The previous attempt (#615 — switch to ubuntu-latest) didn't fix it.
On both Blacksmith and ubuntu-latest, the bundled Node lives under /usr/local
with permissions the runner user can't write to. `npm install -g
npm@latest` therefore fails with EACCES on /usr/local/share/man/man7
on either runner.

setup-node@v5 installs Node 22 into the runner-user-writable
tool_cache. Subsequent `npm -g` writes go to a user-owned path. No
EACCES, no runner gymnastics.

Keeping ubuntu-latest from #615 — Felipe's call that OIDC is unstable
on Blacksmith. setup-node is the actual EACCES fix on top.
ci(version): add actions/setup-node to fix EACCES on npm -g install
ci(version): add --force to npm self-upgrade
…grade

ci(version): use Node 24 to skip broken npm self-upgrade
`buildPm2StartArgs` did not pass `--cwd`, so pm2 inherited the calling
shell's cwd and baked it into the omni-api / omni-nats entries. When
that directory was later removed (transient build dir, deleted
submodule, switched git branch), every subsequent `pm2 restart`
failed silently with `Error: spawn bash ENOENT` because pm2 chdir'd
into the missing path before exec.

The 2026-05-07 incident: `omni update --next` ran from a shell whose
cwd was `repos/pgserve` (a now-removed canonical-pgserve experiment
directory). `startServices` did `pm2 delete omni-api && pm2 start ...`
without `--cwd`, baking that path. Subsequent restarts (from the
SessionStart hook, from `omni doctor --fix`, from pm2 max_restarts
backoff) all failed with ENOENT and zero log output, because the
launcher bash never even started. doctor's `cli-key-valid` rotation
path then dutifully reported "rotated key does not validate" because
the API it was validating against had no listener at all.

The omni-server launcher (`bin/omni-server`) does its own
`cd "$(dirname "$0")/../dist/server"` at line 6, so pm2's cwd does
not need to point anywhere meaningful — it just needs to point
somewhere that exists for the lifetime of the user account. `$HOME`
is the obvious anchor.

Adds:
- `getPm2AnchorCwd()` in `pm2.ts` returning `homedir()`.
- `--cwd <anchor>` injected by `buildPm2StartArgs` for both api and
  nats launches, so `startServices` (install/update),
  `fixPgserveCanonical`, `fixPm2EnvDrift`, and `fixPm2MaxRestarts`
  all benefit from a single change.
- Regression tests asserting `--cwd` is present and equals `$HOME`
  for both api and nats argv.
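
A minimal sketch of the anchoring idea, assuming a simplified launch shape. The `Pm2Launch` interface, the argv layout, and passing the home directory as a parameter are illustrative assumptions; the real `buildPm2StartArgs` and `getPm2AnchorCwd()` in `pm2.ts` may look different.

```typescript
// Hypothetical launch description; not the actual type used in pm2.ts.
interface Pm2Launch {
  name: string;   // e.g. "omni-api" or "omni-nats"
  script: string; // path to the launcher script
}

// The commit describes getPm2AnchorCwd() as returning homedir(). Here the home
// directory is passed in so the sketch needs no Node-specific imports.
function getPm2AnchorCwd(home: string): string {
  return home;
}

// Builds the argv for `pm2 start`, always pinning --cwd to the anchor so pm2
// never records the calling shell's working directory.
function buildPm2StartArgs(launch: Pm2Launch, home: string): string[] {
  return [
    "start", launch.script,
    "--name", launch.name,
    "--cwd", getPm2AnchorCwd(home),
  ];
}

const apiArgs = buildPm2StartArgs(
  { name: "omni-api", script: "bin/omni-server" },
  "/home/omni",
);
console.log(apiArgs.join(" "));
```

Because every start path goes through one builder, a regression test only has to check that `--cwd` is present and followed by the anchor.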
fix(cli): anchor pm2 --cwd to $HOME so restarts cannot ENOENT
Regression introduced by 01b5511 (`ci(version): switch npm publish to
OIDC trusted publishing`, 2026-05-07): only `version.yml` was migrated;
`release.yml` continued using `NPM_TOKEN`. Once npm Trusted Publishing
was registered for `@automagik/omni` (so version.yml could publish via
OIDC), the package's legacy-token publish path stopped accepting
`NPM_TOKEN` and started returning a misleading 404 — exactly the trap
version.yml's own comment warns about:

  > Without [id-token: write], the publish step gets an empty
  > placeholder token and the registry returns a misleading 404
  > instead of the real auth failure.

Symptom from the failing run:

  Tag: latest
  Access: public
  Registry: https://registry.npmjs.org/
  404 Not Found: https://registry.npmjs.org/@automagik%2fomni
  '@automagik/omni@2.260508.1' does not exist in this registry
  Error: Process completed with exit code 1

The 404 fired during the `npm dist-tag add` retag fallback, after
`bun publish` already failed because the package's only authorized
publisher is now the OIDC Trusted Publisher entry pointing at
version.yml — release.yml was effectively unauthenticated.

Fix mirrors the version.yml pattern verbatim:

- Add `id-token: write` permission so the runner can mint OIDC tokens.
- Switch `runs-on` from `blacksmith-4vcpu-ubuntu-2404` to
  `ubuntu-latest`. Setup-node's tool_cache resolves under /usr/local
  on the Blacksmith image where the runner user gets EACCES; GitHub-
  hosted runners give the runner user write access there. (Same
  trade-off documented on version.yml.)
- Add `actions/setup-node@v5` with `node-version: '24'` and
  `registry-url: 'https://registry.npmjs.org'`. Node 24 ships
  npm 11.x; Trusted Publishing requires npm >= 11.5.1.
- Replace `bun publish` with `npm publish`. Drop the manual
  `~/.npmrc` write and the NPM_TOKEN env vars — OIDC handles auth
  for both `npm publish` AND the `npm dist-tag add` retag fallback.
- Set `NPM_CONFIG_PROVENANCE: "false"` (npm auto-enables provenance
  whenever id-token is writable; sigstore attestation 422s on
  GitHub-hosted runners for this package).

IMPORTANT: this file is now a publisher of @automagik/omni from npm's
perspective. Both version.yml AND release.yml must be registered as
Trusted Publishers at https://www.npmjs.com/package/@automagik/omni/access.
The included comment block on the auth step calls this out so a future
rename does not silently 404 again.
ci(release): migrate npm publish to OIDC Trusted Publishing
moraisdev and others added 6 commits June 15, 2026 22:47
…ied-entry-flow-payload

fix(channel-gupshup): accept simplified HV-Entry-Flow payload
Follow-up to #707.

- Synthesized dedupe id now uses a content digest (FNV-1a) instead of
  text length, so two distinct messages from the same sender in the same
  second no longer collide and get one silently dropped as a duplicate.
- normalizeSimplifiedWebhook returns null when there's no text, so an
  empty simplified payload drops instead of dispatching an empty inbound.
- Tests: distinct same-second equal-length ids stay distinct; no-text
  payloads normalize to null.
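
A rough sketch of why a content digest avoids the collision that a length-based id allowed. The id layout (`sender:second:digest`) and the payload shape are assumptions made for illustration, not the channel's actual format. The hash mixes UTF-16 code units, which matches byte-wise 32-bit FNV-1a for ASCII input only.

```typescript
// 32-bit FNV-1a. For ASCII text this equals the standard byte-wise variant;
// non-ASCII code units are mixed in whole, a simplification for this sketch.
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, wrapping 32-bit multiply
  }
  return hash >>> 0; // reinterpret as unsigned
}

// Hypothetical id layout: sender, second-resolution timestamp, content digest.
function synthesizeDedupeId(sender: string, epochSeconds: number, text: string): string {
  return `${sender}:${epochSeconds}:${fnv1a32(text).toString(16)}`;
}

// Hypothetical simplified payload; only the fields this sketch needs.
interface SimplifiedPayload {
  sender: string;
  timestamp: number;
  text?: string;
}

// Returns null when there is no text, so the caller drops the payload instead
// of dispatching an empty inbound message.
function normalizeSimplifiedWebhook(payload: SimplifiedPayload): { id: string; text: string } | null {
  const text = (payload.text ?? "").trim();
  if (text.length === 0) return null;
  return { id: synthesizeDedupeId(payload.sender, payload.timestamp, text), text };
}
```

Two equal-length messages such as "hi!" and "hi?" sent in the same second now produce different ids, whereas an id built from the text length would have been identical.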
…upe-collision

fix(channel-gupshup): harden simplified-payload id + drop empty text
Addresses three P1 findings from the #589 bot review:

extractLockTargets resolved authorization targets from path params and the
JSON body only, with the body taking precedence. This allowed:
  • Scope bypass — a caller could hit a path-scoped write (e.g.
    PATCH /instances/:realId) and inject body.instanceId pointing at an
    allowlisted instance, so allowlist + require_genie_signature checks
    evaluated against the wrong instance.
  • Missed enforcement / false 403 — header-scoped routes (POST /turns/close
    targets via x-omni-instance / x-omni-chat) were invisible to the
    extractor, so signature enforcement was skipped and allowlisted profile
    keys were wrongly denied.

Fix:
  • extractLockTargets now also reads x-omni-instance / x-omni-chat (via the
    new readHeaderTargets helper) and applies precedence path > header > body.
    Route-derived targets (path, header) are trusted; the caller-controllable
    body can no longer override them and is used only as a last resort.
  • require-signed-instance + scope-enforcer middleware both pass header
    targets through.

Tests: path/header beat conflicting body targets; headers populate targets
for header-scoped routes; body still used when no path/header target.
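
The precedence rule can be sketched as below. The `RequestLike` shape and the path parameter names `instanceId` / `chatId` are assumptions for illustration; only the header names and the path > header > body ordering come from the description above.

```typescript
interface LockTargets {
  instanceId?: string;
  chatId?: string;
}

// Hypothetical request shape; real middleware would read these from its
// framework context.
interface RequestLike {
  params: Record<string, string | undefined>;
  headers: Record<string, string | undefined>;
  body: Record<string, unknown>;
}

// Returns the first candidate that is a non-empty string.
function firstString(...candidates: unknown[]): string | undefined {
  for (const candidate of candidates) {
    if (typeof candidate === "string" && candidate.length > 0) return candidate;
  }
  return undefined;
}

function readHeaderTargets(headers: Record<string, string | undefined>): LockTargets {
  return {
    instanceId: firstString(headers["x-omni-instance"]),
    chatId: firstString(headers["x-omni-chat"]),
  };
}

// Route-derived targets (path, then header) win; the caller-controlled body is
// consulted only when neither of them names a target.
function extractLockTargets(req: RequestLike): LockTargets {
  const header = readHeaderTargets(req.headers);
  return {
    instanceId: firstString(req.params.instanceId, header.instanceId, req.body.instanceId),
    chatId: firstString(req.params.chatId, header.chatId, req.body.chatId),
  };
}
```

With this ordering, a body field that names an allowlisted instance cannot redirect the authorization check away from the instance in the path.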
vasconceloscezar and others added 5 commits June 17, 2026 10:38
…targets-precedence

fix(api): close scope-enforcer body-injection + header-target gaps (P1)
The 'is synchronous and fast (<1ms)' test asserted 1000 recordCheckpoint
calls finish in <100ms wall-clock. That's an environmental perf budget, not a
behavioral guarantee — it fails intermittently on a loaded CI host (observed
110ms) while passing in isolation.

Rewritten to assert the actual intent: recordCheckpoint returns synchronously
(no thenable) and completes its work in-call (journey present immediately
after), with a generous 1000ms ceiling kept only as a regression backstop.
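
One way the rewritten assertion could look, using a stand-in tracker. `TinyTracker` and `isThenable` are hypothetical helpers written for this sketch; the real JourneyTracker API is not shown in this commit.

```typescript
// Hypothetical stand-in for JourneyTracker with just enough behavior to check.
class TinyTracker {
  private journeys = new Map<string, string[]>();

  recordCheckpoint(journeyId: string, checkpoint: string): void {
    const list = this.journeys.get(journeyId) ?? [];
    list.push(checkpoint);
    this.journeys.set(journeyId, list);
  }

  has(journeyId: string): boolean {
    return this.journeys.has(journeyId);
  }
}

// True for anything promise-like, which would indicate hidden async work.
function isThenable(value: unknown): boolean {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { then?: unknown }).then === "function"
  );
}

const tracker = new TinyTracker();
const startedAt = Date.now();
let lastResult: unknown;
for (let i = 0; i < 1000; i++) {
  lastResult = tracker.recordCheckpoint(`journey-${i}`, "received");
}
const elapsedMs = Date.now() - startedAt;

// Behavioral checks: synchronous return, and the work is visible right away.
const returnedSynchronously = !isThenable(lastResult);
const journeyPresent = tracker.has("journey-0");
// Loose ceiling kept only as a regression backstop, not a perf budget.
const withinBackstop = elapsedMs < 1000;
```

The first two checks do not depend on host load, so only a genuine regression can trip them.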
…er-timing

test(core): de-flake JourneyTracker hot-path timing assertion
Addresses #589 review findings #5, #6, #7 (verified still-present on dev).

Settings clobber (#5, #6 + a third matching site):
  Several writes set chats.settings to a freshly-built object, which replaces
  the entire JSONB column and drops unrelated keys (followUpConfig, close*,
  agentResumedAt, …). Fixed to merge over the existing settings — matching the
  established pattern in reopen-contact and PUT /follow-up/chats/:id:
    • POST handoff (agentPaused: true)
    • POST close-contact (agentPaused/closed/closeUntil/closeOutcome)
    • clear-session resume (agentPaused: false) — same bug, not in the review

chat.closed escalated flag (#7):
  The event published escalated: terminal, so hard-terminal closes (won/lost,
  which are terminal:true but escalated:false) were always reported as
  escalations, corrupting BI/audit + any automation distinguishing hard closes
  from threshold escalations. Now publishes the computed escalated value (the
  HTTP response already returned the correct one).

Service-layer merge was rejected on purpose: DELETE /follow-up/chats/:id relies
on a wholesale replace to drop a key, so merging must stay at the call sites.
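
A small sketch of both fixes, assuming a shallow merge is what "merge over the existing settings" means here. `mergeSettings`, `CloseOutcome`, and `chatClosedEvent` are illustrative names, not the helpers used in the API.

```typescript
type ChatSettings = Record<string, unknown>;

// Call-site merge: unrelated keys in the existing JSONB value survive, and keys
// in the patch overwrite their previous values. Shallow merge is an assumption.
function mergeSettings(existing: ChatSettings | null, patch: ChatSettings): ChatSettings {
  return { ...(existing ?? {}), ...patch };
}

// Hypothetical close outcome: `terminal` and `escalated` are separate facts.
interface CloseOutcome {
  terminal: boolean;
  escalated: boolean;
}

// Publishes the computed escalated flag rather than reusing `terminal`.
function chatClosedEvent(chatId: string, outcome: CloseOutcome) {
  return { type: "chat.closed", chatId, escalated: outcome.escalated };
}

const before: ChatSettings = { followUpConfig: { delayHours: 24 }, agentPaused: false };
const afterHandoff = mergeSettings(before, { agentPaused: true });
const wonClose = chatClosedEvent("chat-1", { terminal: true, escalated: false });
```

Keeping the merge at the call sites leaves the service layer free to do a wholesale replace where a route needs to drop a key.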
vasconceloscezar and others added 12 commits June 17, 2026 10:58
…d-escalated-flag

fix(api): merge chat.settings on writes + publish real escalated flag
…n hint (#722)

Partial fix for #722 (embedded->canonical upgrade break). Two operator-facing
gaps fixed; the host-tooling-free data copy is a separate follow-up.

#1 fail-fast (root cause: opaque crash loop):
  The dev API only fail-fasts on PGSERVE_EMBEDDED=true, which legacy embedded
  installs (pre-canonical-cutover) never set — so they fell into the API's
  30x 'Database not ready' retry crash loop with no guidance. `omni start`
  now detects the legacy-embedded shape (no useCanonicalPgserve flag + an
  embedded data dir present) and exits with the `omni doctor --fix` hint
  before starting omni-api. Extracted legacyEmbeddedNeedsMigration() + tests.

#3 misleading remediation hint:
  dumpEmbeddedDb/restoreSnapshotToCanonical told operators to
  `apt install postgresql-client`, which on Ubuntu 24.04 is PG16 and refuses
  to dump a PG18 server. New clientToolHint() names the REQUIRED major from
  the embedded cluster's PG_VERSION (e.g. postgresql-client-18 / brew
  postgresql@18) and warns the bare package is often too old.

Validated end-to-end in an isolated sandbox: omni start now fails fast with
exit 1 + the migration hint; doctor --fix surfaces the major-aware client hint.
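
The two checks might be sketched like this. The `InstallState` shape, the signatures, and the hint wording are assumptions for illustration; the commit only names `legacyEmbeddedNeedsMigration()` and `clientToolHint()`.

```typescript
// Hypothetical view of what `omni start` can see before launching omni-api.
interface InstallState {
  useCanonicalPgserve?: boolean;  // absent on legacy embedded installs
  embeddedDataDirExists: boolean; // an embedded data dir is present on disk
}

// Legacy-embedded shape: no canonical flag and an embedded data dir present.
function legacyEmbeddedNeedsMigration(state: InstallState): boolean {
  return !state.useCanonicalPgserve && state.embeddedDataDirExists;
}

// Builds a hint that names the required client major, read from the contents
// of the embedded cluster's PG_VERSION file (for example "18\n").
function clientToolHint(pgVersionFileContents: string): string {
  const major = parseInt(pgVersionFileContents.trim(), 10);
  if (isNaN(major)) {
    return "Install a PostgreSQL client matching the embedded server's major version.";
  }
  return (
    `Install postgresql-client-${major} (apt) or postgresql@${major} (brew); ` +
    "the bare postgresql-client package is often an older major."
  );
}

const legacy = legacyEmbeddedNeedsMigration({ embeddedDataDirExists: true });
const hint = clientToolHint("18\n");
```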

Refs #722.
…-migration

fix(cli): fail-fast on legacy embedded pgserve + major-aware migration hint (#722)
…s.js (#722)

The embedded->canonical migration copied data by shelling out to host
psql/pg_dump — tools omni doesn't bundle and that, when present from a distro,
are often an older major than the embedded server (PG18) and refuse to run.
That's the root of the #722 upgrade break.

Replace the psql transport with postgres.js COPY streams (a dependency the
repo already ships in @omni/db):
  - pgConnect() opens a wire connection to the temp embedded reader + canonical
  - listTables/listColumns/copyTable/resetSequences now run over the wire
  - copyTable streams COPY ... TO STDOUT -> COPY ... FROM STDIN with pipeline()
    backpressure (kills the EPIPE-on-large-rows failure the old psql->file
    buffering worked around)
  - one reserved dst connection holds session_replication_role='replica'
    across TRUNCATE + every COPY + sequence resets (it's a session GUC)

Net: migrateUnmountedEmbeddedToCanonical + compareEmbeddedVsCanonicalCounts
need NO host client tools — only the matching-major server reader, already
auto-fetched. Verified in a sandbox: postgres.js text-COPY round-trips
jsonb/bytea/int[]/nulls byte-for-byte.

Extracted resolveTablesToMigrate/copyAllTables/countRowDivergence to stay under
the cognitive-complexity ceiling. Existing migration + doctor tests green.
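
As an illustration of the count comparison, a divergence check could look roughly like this. The signature and the rule that a table missing on the canonical side counts as zero rows are assumptions; the actual `countRowDivergence` is not shown in this commit.

```typescript
type RowCounts = Record<string, number>;

// Returns the tables whose row counts differ between the embedded source and
// the canonical destination. A table absent from canonical is treated as 0 rows.
function countRowDivergence(embedded: RowCounts, canonical: RowCounts): string[] {
  const diverged: string[] = [];
  for (const table of Object.keys(embedded)) {
    const expected = embedded[table];
    const actual = canonical[table] ?? 0;
    if (expected !== actual) diverged.push(table);
  }
  return diverged;
}

const diverged = countRowDivergence(
  { chats: 10, messages: 250, instances: 1 },
  { chats: 10, messages: 249 },
);
console.log(diverged.join(","));
```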

Follow-up (same issue): wire the primary fixPgserveCanonical path
(schema-via-migrations + this copier) so `omni doctor --fix` is fully
automatic; today it still uses the pg_dump/psql snapshot path.

Refs #722.
Synchronous + instant, but the default 5s per-test timeout flakes under heavy
host load (CPU starvation inflates wall-clock >5s), intermittently failing the
full-suite pre-push gate. 30s headroom — same class of fix as the JourneyTracker
de-flake.
…n-pgjs

refactor(cli): host-tooling-free embedded→canonical migration via postgres.js (#722)
…ing-free) (#722)

Rewire fixPgserveCanonical off the pg_dump/psql snapshot path onto the
host-tooling-free postgres.js copier (#724). New flow:
  stop omni-api → pgserve install → persist canonical config →
  delete+start omni-api on canonical (startup runs drizzle migrations →
  schema) → wait for /health → copy data via migrateEmbeddedData (postgres.js
  COPY over the wire).

The schema is created by omni-api's own migrations, so the copy is data-only.
No pg_dump/psql, no client-major mismatch, no manual steps — a fresh
`omni doctor --fix` now completes the embedded→canonical migration on its own.

Failure semantics for the new architecture (this build can't run embedded):
  - setup fails (pre-flip): restart omni-api on embedded, FAIL.
  - after the config flip canonical is the target; a health-timeout or copy
    failure leaves canonical healthy/empty with the embedded data INTACT, and
    the idempotent embedded-data-orphaned check re-runs the copy next time.

Added DoctorDeps.migrateEmbeddedData (stubbable) + waitForApiHealthy. Rewrote
the pgserve-canonical fix tests for the new ordering + failure modes
(dump/restore are no longer invoked; copy-skip, health-timeout, copy-throw
covered). dumpEmbeddedDb/restoreSnapshotToCanonical remain for now (still
referenced by update-maintenance tests) but are no longer used by the fix.
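
The ordering and failure semantics can be sketched with injected steps. The trimmed `DoctorDeps` interface, the `FixResult` type, and treating the config write as part of the pre-flip setup are assumptions for illustration; only `migrateEmbeddedData` and `waitForApiHealthy` are named in this commit.

```typescript
// Hypothetical, trimmed-down deps: every step is injected so tests can stub it.
interface DoctorDeps {
  stopApi(): Promise<void>;
  installPgserve(): Promise<void>;
  persistCanonicalConfig(): Promise<void>;
  startApiOnCanonical(): Promise<void>; // startup migrations create the schema
  restartApiOnEmbedded(): Promise<void>;
  waitForApiHealthy(): Promise<boolean>;
  migrateEmbeddedData(): Promise<void>; // data-only copy over the wire
}

type FixResult =
  | { ok: true }
  | { ok: false; stage: "setup" | "health" | "copy" };

async function fixPgserveCanonical(deps: DoctorDeps): Promise<FixResult> {
  await deps.stopApi();
  try {
    await deps.installPgserve();
    await deps.persistCanonicalConfig();
  } catch {
    // Setup failure (treated as pre-flip here): go back to embedded and FAIL.
    await deps.restartApiOnEmbedded();
    return { ok: false, stage: "setup" };
  }

  // From here canonical is the target; embedded data stays untouched on disk.
  await deps.startApiOnCanonical();
  if (!(await deps.waitForApiHealthy())) {
    return { ok: false, stage: "health" };
  }
  try {
    await deps.migrateEmbeddedData();
  } catch {
    // Canonical is healthy but empty; a later run can retry the copy.
    return { ok: false, stage: "copy" };
  }
  return { ok: true };
}
```

Because each step is a dependency, the copy-throw and health-timeout cases can be exercised with stubs and no real database.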

Orchestration validated by 49 doctor unit tests; the postgres.js COPY engine
was proven byte-for-byte in #724.

Closes #722.
The clean-room e2e surfaced two more psql shell-outs in the canonical setup
path that the doctor-fix rewire depends on — both assumed `psql` is on PATH,
which is false on a fresh host (autopg/pgserve ship only initdb/pg_ctl/postgres,
no client tools). Bun.spawn throws on a missing executable, aborting the
migration with 'Executable not found in $PATH: psql'.

Converted both to postgres.js (already a CLI dep):
  - canonical-pgserve.ts ensureOmniDatabaseExists: CREATE DATABASE via wire,
    42P04 → already-exists → success (idempotent).
  - role-cutover.ts runProvisioningSql (was runPsql): runs the role-create +
    grants scripts as a simple query over the local unix socket as postgres
    superuser; fails-fast on first error (ON_ERROR_STOP equivalent).
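
The idempotent create can be sketched as an error classification. `createDatabase` is an injected stand-in for running CREATE DATABASE over the wire, and the function names are hypothetical; `42P04` is PostgreSQL's SQLSTATE for `duplicate_database`.

```typescript
const DUPLICATE_DATABASE = "42P04"; // PostgreSQL SQLSTATE: duplicate_database

// True when the error carries the duplicate_database SQLSTATE in `code`.
function isAlreadyExists(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === DUPLICATE_DATABASE
  );
}

// Runs the injected CREATE DATABASE call. "Already exists" counts as success,
// so repeated runs are safe; any other error is rethrown.
async function ensureDatabaseExists(
  createDatabase: () => Promise<void>,
): Promise<"created" | "already-exists"> {
  try {
    await createDatabase();
    return "created";
  } catch (err) {
    if (isAlreadyExists(err)) return "already-exists";
    throw err;
  }
}
```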

With these, `omni doctor --fix` needs ZERO host postgres client tools.

Validated end-to-end in a clean container: @latest embedded install + seeded
canary → upgrade to patched @next → `omni doctor --fix` → 13 OK / 0 FAIL,
omni-api healthy on canonical, 30 tables copied, canary instance preserved
(identical UUID).

Refs #722.
fix(cli): fully automatic, host-tooling-free omni doctor --fix migration (#722)
github-actions Bot and others added 4 commits June 18, 2026 18:13
…#722)

Clean-room re-validation of the update-from-dev path surfaced a first-pass race:
when `pgserve install` provisions canonical DURING `omni doctor --fix`, the
freshly-started postmaster may not accept connections for a second or two.
ensureOmniDatabaseExists and the role cutover connect immediately afterwards;
the connection is refused, both steps no-op (best-effort), and omni-api is left
crash-looping on a missing `omni` db. The operator had to run
`omni doctor --fix` a second time for it to succeed.

Add waitForCanonicalReady(port): poll the postmaster (postgres.js SELECT 1, up
to ~30s) after `pgserve install` and before db/role provisioning.
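
A readiness poll along these lines is one way to close the race. `pollUntilReady`, the injected `probe` and `sleep`, and the 500ms interval are assumptions for illustration; the commit only states that the real check is a postgres.js SELECT 1 with a budget of roughly 30s.

```typescript
// Polls until the probe succeeds or the time budget is used up.
// `probe` stands in for a SELECT 1 that rejects while the postmaster is still
// starting. The budget is counted in slept time, not wall-clock time.
async function pollUntilReady(
  probe: () => Promise<void>,
  sleep: (ms: number) => Promise<void>,
  timeoutMs: number = 30000,
  intervalMs: number = 500,
): Promise<boolean> {
  for (let waited = 0; waited <= timeoutMs; waited += intervalMs) {
    try {
      await probe();
      return true; // postmaster accepted the connection
    } catch {
      // Connection refused or still starting: fall through and retry.
    }
    await sleep(intervalMs);
  }
  return false;
}
```

Provisioning the database and roles only after this returns `true` is what lets a single `omni doctor --fix` pass succeed.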

Validated in a fresh container: @latest embedded + seeded canary → update to
this build → a SINGLE `omni doctor --fix` migrates 30 tables to canonical,
omni-api healthy on canonical, canary preserved. (Previously needed two passes.)

Refs #722.
…s-wait

fix(cli): wait for canonical postmaster readiness before provisioning (#722)

7 participants